Cross-lingual Ontology Alignment using EuroWordNet and Wikipedia

نویسنده

  • Gosse Bouma
چکیده

This paper describes a system for linking the thesaurus of the Netherlands Institute for Sound and Vision to English WordNet and dbpedia. The thesaurus contains subject (concept) terms, and names of persons locations, and miscalleneous names. We used EuroWordNet, a multilingual wordnet, and Dutch Wikipedia as intermediaries for the two alignments. EuroWordNet covers most of the subject terms in the thesaurus, but the organization of the cross-lingual links makes selection of the most appropriate English target term almost impossible. Precision and recall of the automatic alignment with WordNet for subject terms is 0.59. Using page titles, redirects, disambiguation pages, and anchor text harvested from Dutch Wikipedia gives reasonable performance on subject terms and geographical terms. Many person and miscalleneous names in the thesaurus could not be located in (Dutch or English) Wikipedia. Precision for miscellaneous names, subjects, persons and locations for the alignment with Wikipedia ranges from 0.63 to 0.94, while recall for subject

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cross-lingual Dutch to English alignment using EuroWordNet and Dutch Wikipedia

This paper describes a system for linking the thesaurus of the Netherlands Institute for Sound and Vision to English WordNet and dbpedia. We used EuroWordNet, a multilingual wordnet, and Dutch Wikipedia as intermediaries for the two alignments. EuroWordNet covers most of the subject terms in the thesaurus, but the organization of the cross-lingual links makes selection of the most appropriate E...

متن کامل

Applying Wikipedia's Multilingual Knowledge to Cross-Lingual Question Answering

The application of the multilingual knowledge encoded in Wikipedia to an open–domain Cross–Lingual Question Answering system based on the Inter Lingual Index (ILI) module of EuroWordNet is proposed and evaluated. This strategy overcomes the problems due to ILI’s low coverage on proper nouns (Named Entities). Moreover, as these are open class words (highly changing), using a community–based up– ...

متن کامل

Multilingual Design Of EuroWordNet

This paper discusses the design of the EuroWordNet database, in which semantic databases like WordNetl.5 for several languages are combined via a so-called inter-lingual-index. In this database, language-independent data is shared and language-specific properties are maintained as well. A special interface has been developed to compare the semantic configurations across languages and to track d...

متن کامل

Cross-lingual Alignment and Completion of Wikipedia Templates

For many languages, the size of Wikipedia is an order of magnitude smaller than the English Wikipedia. We present a method for cross-lingual alignment of template and infobox attributes in Wikipedia. The alignment is used to add and complete templates and infoboxes in one language with information derived from Wikipedia in another language. We show that alignment between English and Dutch Wikip...

متن کامل

Monolingual and cross-lingual ontology matching with CIDER-CL: evaluation report for OAEI 2013

CIDER-CL is the evolution of CIDER, a schema-based ontology alignment system. Its algorithm compares each pair of ontology entities by analysing their similarity at different levels of their ontological context (linguistic description, superterms, subterms, related terms, etc.). Then, such elementary similarities are combined by means of artificial neural networks. In its current version, CIDER...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010